Abstract. Since Val Tannen's pioneering work on the combination of
simply-typed λ-calculus and first-order rewriting [11], many authors have
contributed to this subject by extending it to richer typed λ-calculi and
rewriting paradigms, culminating in the Calculus of Algebraic Constructions.
These works provide theoretical foundations for type-theoretic
proof assistants where functions and predicates are defined by oriented
higher-order equations. This kind of definition subsumes usual inductive
definitions, is easier to write and provides more automation.

On the other hand, checking that such user-defined rewrite rules, when
combined with β-reduction, are strongly normalizing and confluent, and
preserve the decidability of type-checking, is more difficult. Most
termination criteria rely on the term structure. In a previous work, we
extended the notion of "sized types", studied by several authors in the
simpler framework of ML-like languages, to dependent types and higher-order
rewriting, and proved that it preserves strong normalization.

The main contribution of the present paper is twofold. First, we prove 
that, in the Calculus of Algebraic Constructions with size annotations, 
the problems of type inference and type-checking are decidable, provided 
that the sets of constraints generated by size annotations are satisfiable 
and admit most general solutions. Second, we prove the latter proper- 
ties for a size algebra rich enough for capturing usual induction-based 
definitions and much more. 



1 Introduction 

The notion of "sized type" was first introduced in [21] and further studied by 
several authors [20,3,1,31] as a tool for proving the termination of ML-like 
function definitions. It is based on the semantics of inductive types as fixpoints 
of monotone operators, reachable by transfinite iteration. For instance, natural
numbers are the limit of (S_i)_{i<ω}, where S_i is the set of natural numbers smaller
than i (inductive types with constructors having functional arguments require
ordinals bigger than ω). The idea is then to reflect this in the syntax by adding
size annotations on types indicating in which subset S_i a term is. For instance,
subtraction on natural numbers can be assigned the type − : nat^α ⇒ nat^β ⇒
nat^α, where α and β are implicitly universally quantified, meaning that the size



of its output is not bigger than the size of its first argument. Then, one can
ensure termination by restricting recursive calls to arguments whose size - by
typing - is smaller. For instance, the following ML-like definition of ⌈x/(y+1)⌉:

letrec div x y = match x with
  | O -> O
  | S x' -> S (div (x' - y) y)

is terminating since, if x is of size at most α and y is of size at most β, then x'
is of size at most α − 1 and (x' − y) is of size at most α − 1 < α.
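The decrease argument can be observed on concrete numbers. The following is only a sketch, assuming Python integers stand in for unary naturals; `monus` (truncated subtraction) is a helper name introduced here, and the runtime `assert` merely mimics what the size annotation guarantees statically.

```python
def monus(x: int, y: int) -> int:
    """Truncated subtraction on naturals; the result never exceeds x."""
    return x - y if x > y else 0


def div(x: int, y: int) -> int:
    """Mirror of the letrec definition, on ints standing in for unary naturals."""
    if x == 0:                 # case O
        return 0
    pred = x - 1               # case S x': pred plays the role of x'
    arg = monus(pred, y)       # size of (x' - y) is at most the size of x'
    assert arg < x             # the strict decrease that justifies termination
    return 1 + div(arg, y)
```

On this encoding, `div(x, y)` agrees with the ceiling of x/(y+1) for the small inputs checked below.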

The Calculus of Constructions (CC) [17] is a powerful type system with
polymorphic and dependent types, allowing the encoding of higher-order logic. The
Calculus of Algebraic Constructions (CAC) [8] is an extension of CC where
functions are defined by higher-order rewrite rules. As shown in [10], it subsumes the
Calculus of Inductive Constructions (CIC) [18] implemented in the Coq proof
assistant [15], where functions are defined by induction. Using rule-based
definitions has numerous advantages over induction-based definitions: definitions
are easier (e.g. Ackermann's function), more propositions can be proved
equivalent automatically, one can add simplification rules like associativity or use
rewriting modulo AC [6], etc. For proving that user-defined rules terminate when
combined with β-reduction, [8] essentially checks that recursive calls are made
on structurally smaller arguments.

In [7], we extended the notion of sized type to CAC, giving the Calculus of 
Algebraic Constructions with Size Annotations (CACSA). We proved that, when 
combined with β-reduction, user-defined rules terminate essentially if recursive
calls are made on arguments whose size - by typing - is strictly smaller, by
possibly using lexicographic and multiset comparisons. Hence, the following
rule-based definition of ⌈x/(y+1)⌉:

0 / y → 0
(s x) / y → s ((x − y) / y)

is terminating since, in the last rule, if x is of size at most α and y is of size
at most β, then (s x) is of size at most α + 1 and (x − y) is of size at most
α < α + 1. Note that this rewrite system cannot be proved terminating by
criteria only based on the term structure, like RPO or its extensions to higher- 
order terms [22, 29]. Note also that, if a term t is structurally smaller than a term 
u, then the size of t is smaller than the size of u. Therefore, CACSA proves the 
termination of any induction-based definition like CIC / Coq, but also definitions 
like the previous one. To our knowledge, this is the most powerful termination 
criterion for functions with polymorphic and dependent types like in Coq. The 
reader can find other convincing examples in [7]. 

However, [7] left an important question open. For the termination criterion to 
work, we need to make sure that size annotations assigned to function symbols 
are valid. For instance, if subtraction is assigned the type − : nat^α ⇒ nat^β ⇒
nat^α, then we must make sure that the definition of − indeed outputs a term
whose size is not greater than the size of its first argument. This amounts to
checking that, for every rule in the definition of −, the size of the right hand-side
is not greater than the size of the left hand-side. This can be easily verified by
hand if, for instance, the definition of − is as follows:

0 − x → 0
x − 0 → x
(s x) − (s y) → x − y

The purpose of the present work is to prove that this can be done automatically,
by inferring the size of both the left and right hand-sides, and checking
that the latter is not greater than the former.



Fig. 1. Insertion sort on polymorphic and dependent lists

nil : (A : ★) list^α A 0
cons : (A : ★) A ⇒ (n : nat) list^α A n ⇒ list^{sα} A (s n)
if_in_then_else_ : bool ⇒ (A : ★) A ⇒ A ⇒ A
insert : (A : ★)(≤ : A ⇒ A ⇒ bool) A ⇒ (n : nat) list^α A n ⇒ list^{sα} A (s n)
sort : (A : ★)(≤ : A ⇒ A ⇒ bool)(n : nat) list^α A n ⇒ list^α A n

if true in A then u else v → u
if false in A then u else v → v
insert A ≤ x _ (nil _) → cons A x 0 (nil A)
insert A ≤ x _ (cons _ y n l) → if x ≤ y in list A (s (s n))
                                 then cons A x (s n) (cons A y n l)
                                 else cons A y (s n) (insert A ≤ x n l)
sort A ≤ _ (nil _) → nil A
sort A ≤ _ (cons _ x n l) → insert A ≤ x n (sort A ≤ n l)



We now give an example with dependent and polymorphic types. Let ★ be
the sort of types and list : ★ ⇒ nat ⇒ ★ be the type of polymorphic lists of fixed
length whose constructors are nil and cons. Without ambiguity, s is used for the
successor function both on terms and on size expressions. The functions insert 
and sort defined in Figure 1 have size annotations satisfying our termination 
criterion. The point is that sort preserves the size of its list argument and thus 
can be safely used in recursive calls. Checking this automatically is the goal of 
this work. 

An important point is that the ordering naturally associated with size anno- 
tations implies some subtyping relation on types. The combination of subtyping 
and dependent types (without rewriting) is a difficult subject which has been 
studied by Chen [12]. We reused many ideas and techniques of his work for 
designing CACSA and proving important properties like β-subject reduction
(preservation of typing under β-reduction) [5].

Another important point is related to the meaning of type inference. In ML, 
type inference means computing a type of a term in which the types of free and 
bound variables, and function symbols (letrec's in ML), are unknown. In other 
words, it consists in finding a simple type for a pure λ-term. Here, type inference
means computing a CACSA type, hence dependent and polymorphic (CACSA 



contains Girard's system F), of a term in which the types and size annotations of 
free and bound variables, and function symbols, are known. In dependent type 
theories, this kind of type inference is necessary for type-checking [16]. In other 
words, we do not try to infer relations between the sizes of the arguments of a 
function and the size of its output like in [13,4]. We try to check that, with the 
annotated types declared by the user for its function symbols, rules satisfy the 
termination criterion described in [7]. 

Moreover, in ML, type inference amounts to solving equality constraints in
the type algebra. Here, type inference amounts to solving equality and ordering
constraints in the size algebra. The point is that the ordering on size expressions
is not anti-symmetric: it is a quasi-ordering. Thus, we have a combination of
unification and symbolic quasi-ordering constraint solving.

Finally, because of the combination of subtyping and dependent typing, the 
decidability of type-checking requires the existence of minimal types [12]. Thus, 
we must also prove that a satisfiable set of equality and ordering constraints has 
a smallest solution, which is not the case in general. This is in contrast with 
non-dependently typed frameworks. 

Outline. In Section 2, we define terms and types, and study some properties 
of the size ordering. In Section 3, we give a general type inference algorithm and 
prove its correctness and completeness under general assumptions on constraint 
solving. Finally, in Section 4, we prove that these assumptions are fulfilled for the 
size algebra introduced in [3] which, although simple, is rich enough for capturing 
usual inductive definitions and much more, as shown by the first example above. 
Missing proofs are given in [9]. 

2 Terms and types 

Size algebra. Inductive types are annotated by size expressions from the
following algebra A:

a ::= α | s a | ∞

where α ∈ Z is a size variable. The set A is equipped with the quasi-ordering
≤_A defined in Figure 2. Let ≃_A = ≤_A ∩ ≥_A be its associated equivalence.

Let φ, ψ, ρ, ... denote size substitutions, i.e. functions from Z to A. One can
easily check that ≤_A is stable by substitution: if a ≤_A b then aφ ≤_A bφ. We
extend ≤_A to substitutions: φ ≤_A ψ iff, for all α ∈ Z, αφ ≤_A αψ.

We also extend the notion of "more general substitution" from unification
theory as follows: φ is more general than ψ, written φ ⊑ ψ, iff there is φ' such
that φφ' ≤_A ψ.
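The quasi-ordering can be decided by a direct comparison once size expressions are put in the shape s^k α or s^k ∞. The sketch below is not taken from the paper: the `Size` class and the helper names are introduced here for illustration, and the decision rule (b collapses to ∞, or both sides share the same variable and a has no more successors than b) is our reading of what the rules of Figure 2 generate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Size:
    """A size expression s^succs(base); base None encodes the constant oo."""
    succs: int
    base: Optional[str]


INF = Size(0, None)


def var(name: str) -> Size:
    return Size(0, name)


def s(a: Size) -> Size:
    return Size(a.succs + 1, a.base)


def normal_form(a: Size) -> Size:
    """Collapse s^k oo to oo (the rule s oo -> oo)."""
    return INF if a.base is None else a


def leq(a: Size, b: Size) -> bool:
    """Decide a <=_A b (assumed characterisation of the Figure 2 rules)."""
    a, b = normal_form(a), normal_form(b)
    if b.base is None:          # (infty), possibly followed by (succ)
        return True
    return a.base == b.base and a.succs <= b.succs


def equiv(a: Size, b: Size) -> bool:
    """The equivalence associated with the quasi-ordering."""
    return leq(a, b) and leq(b, a)
```

For instance, `equiv(s(INF), INF)` holds although the two expressions are syntactically different, which is why the relation is a quasi-ordering and not an ordering.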

Terms. We assume the reader familiar with typed λ-calculi [2] and rewriting
[19]. Details on CAC(SA) can be found in [8, 7]. We assume given a set S = {★, □}
of sorts (★ is the sort of types and propositions; □ is the sort of predicate types),
a set F of function or predicate symbols, a set CF^□ ⊆ F of constant predicate
symbols, and an infinite set X of term variables. The set T of terms is:



Fig. 2. Ordering on size expressions

(refl)   a ≤_A a

(trans)  a ≤_A b    b ≤_A c
         ------------------
              a ≤_A c

(mon)      a ≤_A b
         ------------
         s a ≤_A s b

(succ)    a ≤_A b
         -----------
         a ≤_A s b

(infty)  a ≤_A ∞



t ::= s | x | C^a | f | [x : t]t | (x : t)t | t t



where s ∈ S, x ∈ X, C ∈ CF^□, a ∈ A and f ∈ F \ CF^□. A term [x : t]u is
an abstraction. A term (x : T)U is a dependent product, simply written T ⇒ U
when x does not occur in U. Let t⃗ denote a sequence of terms t_1, ..., t_n of length
|t⃗| = n.


Every term variable x is equipped with a sort s_x and, as usual, terms
equivalent modulo sort-preserving renaming of bound variables are identified.
Let V(t) be the set of size variables in t, and FV(t) be the set of term
variables free in t. Let θ, σ, ... denote term substitutions, i.e. functions from X
to T. For our previous examples, we have CF^□ = {nat, list, bool} and F =
CF^□ ∪ {0, s, /, nil, cons, insert, sort}.

Rewriting. Terms only built from variables and symbol applications f t⃗ are
said to be algebraic. We assume given a set R of rewrite rules l → r such that
l is algebraic, l = f l⃗ with f ∉ CF^□ and FV(r) ⊆ FV(l). Note that, while left
hand-sides are algebraic and thus require syntactic matching only, right
hand-sides may have abstractions and products. β-reduction and rewriting are defined
as usual: C[[x : T]u v] →_β C[u{x ↦ v}] and C[lσ] →_R C[rσ] if l → r ∈ R. Let
→ = →_β ∪ →_R and →* be its reflexive and transitive closure. Let t ↓ u iff there
exists v such that t →* v *← u.

Typing. We assume that every symbol f is equipped with a sort s_f and a
type τ_f = (x⃗ : T⃗)U such that, for all rules f l⃗ → r ∈ R, |l⃗| ≤ |T⃗| (f is not applied
to more arguments than the number of arguments given by τ_f). Let F^s (resp.
X^s) be the set of symbols (resp. variables) of sort s. As usual, we distinguish
the following classes of terms where t is any term:

- objects: o ::= x ∈ X^★ | f ∈ F^★ | [x : t]o | o t

- predicates: p ::= x ∈ X^□ | C^a ∈ CF^□ | f ∈ F^□ \ CF^□ | [x : t]p | (x : t)p | p t

- kinds: K ::= ★ | (x : t)K

Examples of objects are the constructors of inductive types 0, s, nil, cons, ...
and the function symbols −, /, insert, sort, ... Their types are predicates:
inductive types bool, nat, list, ..., logical connectors ∧, ∨, ..., universal quantifications
(x : T)U, ... The types of predicates are kinds: ★ for types like bool or nat,
★ ⇒ nat ⇒ ★ for list, ...

An environment Γ is a sequence of variable-term pairs. An environment is
valid if a term is typable in it. The typing rules of CACSA are given in Figure 4
and its subtyping rules in Figure 3. In (symb), φ is an arbitrary size substitution.
This reflects the fact that, in type declarations, size variables are implicitly
universally quantified, like in ML. In contrast with [12], subtyping uses no sorting
judgment. This simplification is justified in [5].

In comparison with [5], we added the side condition V(t⃗) = ∅ in (size). It
does not affect the properties proved in [5] and ensures that the size ordering
is compatible with subtyping (Lemma 2). One could also think of taking
the more general rule C^a t⃗ ≤ C^b u⃗ with t⃗ ≃_A u⃗. This would eliminate the
need for equality constraints and thus slightly simplify the constraint solving
procedure. More generally, one could think of taking into account the monotony
of type constructors by having, for instance, list nat^a ≤ list nat^b whenever
a ≤_A b. This requires extending Chen's work [12] and proving again many
non-trivial properties of [5], like Theorem 1 below or subject reduction for β.

Fig. 3. Subtyping rules

(refl)   T ≤ T

(size)   C^a t⃗ ≤ C^b t⃗     (C ∈ CF^□, a ≤_A b, V(t⃗) = ∅)

(prod)      U' ≤ U    V ≤ V'
         ----------------------
         (x : U)V ≤ (x : U')V'

(conv)   T' ≤ U'
         -------     (T ↓ T', U' ↓ U)
          T ≤ U

(trans)  T ≤ U    U ≤ V
         --------------
             T ≤ V



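Fig. 4. Typing rules

(ax)     ⊢ ★ : □

(prod)   Γ ⊢ U : s    Γ, x : U ⊢ V : s'
         ------------------------------
              Γ ⊢ (x : U)V : s'

(size)    ⊢ τ_C : □
         ------------     (C ∈ CF^□, a ∈ A)
         ⊢ C^a : τ_C

(symb)   ⊢ τ_f : s_f
         ------------     (f ∉ CF^□)
         ⊢ f : τ_f φ

(var)        Γ ⊢ T : s_x
         -----------------     (x ∉ dom(Γ))
         Γ, x : T ⊢ x : T

(weak)   Γ ⊢ t : T    Γ ⊢ U : s_x
         ------------------------     (x ∉ dom(Γ))
             Γ, x : U ⊢ t : T

(abs)    Γ, x : U ⊢ v : V    Γ ⊢ (x : U)V : s
         ------------------------------------
               Γ ⊢ [x : U]v : (x : U)V

(app)    Γ ⊢ t : (x : U)V    Γ ⊢ u : U
         -----------------------------
              Γ ⊢ tu : V{x ↦ u}

(sub)    Γ ⊢ t : T    Γ ⊢ T' : s
         -----------------------     (T ≤ T')
               Γ ⊢ t : T'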

∞-Terms. An ∞-term is a term whose only size annotations are ∞. In
particular, it has no size variable. An ∞-environment is an environment made
of ∞-terms. This class of terms is isomorphic to the class of (unannotated) CAC
terms. Our goal is to be able to infer annotated types for these terms, by using
the size annotations given in the type declarations of constructors and function
symbols 0, s, /, nil, cons, insert, sort, ...

Since size variables are intended to occur in object type declarations only, 
and since we do not want matching to depend on size annotations, we assume 



that rules and type declarations of predicate symbols nat, bool, list, ... are made
of ∞-terms. As a consequence, we have:

Lemma 1. - If t → t' then, for all φ, tφ → t'φ.
- If Γ ⊢ t : T then, for all φ, Γφ ⊢ tφ : Tφ.

We make three important assumptions: 

(1) R preserves typing: for all l → r ∈ R, Γ, T and σ, if Γ ⊢ lσ : T then
Γ ⊢ rσ : T. It is generally not too difficult to check this by hand. However,
as already mentioned in [7], finding sufficient conditions for this to hold in
general does not seem trivial.

(2) β ∪ R is confluent. This is for instance the case if R is confluent and
left-linear [24], or if β ∪ R is terminating and R is locally confluent.

(3) β ∪ R is terminating. In [7], it is proved that β ∪ R is terminating essentially
if, in every rule f l⃗ → r ∈ R, recursive calls in r are made on terms whose
size - by typing - is smaller than that of l⃗, by using lexicographic and multiset
comparisons. Note that, with type-level rewriting, confluence is necessary
for proving termination [8].

Important remark. One may think that there is some vicious circle here: we 
assume the termination for proving the decidability of type-checking, while type- 
checking is used for proving termination! The point is that termination checks 
are done incrementally. At the beginning, we can check that some set of rewrite 
rules R_1 is terminating in the system with β only. Indeed, we do not need to use
R_1 in the type conversion rule (conv) for typing the terms of R_1. Then, we can
check in β ∪ R_1 that some new set of rules R_2 is terminating, and so on.

Various properties of CACSA have already been studied in [5]. We refer the
reader to this paper if necessary. For the moment, we just mention two important
and non-trivial properties based on Chen's work on subtyping with dependent
types [12]: subject reduction for β and transitivity elimination:

Theorem 1 ([5]). T ≤ U iff T↓ ≤_s U↓, where ≤_s is the restriction of ≤ to
(refl), (size) and (prod).

We now give some properties of the size and substitution orderings. Let →_A
be the confluent and terminating relation on A generated by the rule s∞ → ∞.

Lemma 2. Let a↓ be the normal form of a w.r.t. →_A.

- a ≃_A b iff a↓ = b↓.

- If ∞ ≤_A a or s^{k+1}a ≤_A a then a↓ = ∞.

- If a ≤_A b and φ ≤_A ψ then aφ ≤_A bψ.

- If φ ≤_A ψ and U ≤ V then Uφ ≤ Vψ.

Note that ∞-terms are in A-normal form. The last property (compatibility
of size ordering w.r.t. subtyping) follows from the restriction V(t⃗) = ∅ in (size).



3 Decidability of typing 



In this section, we prove the decidability of type inference and type-checking for
∞-terms under general assumptions that will be proved in Section 4. We begin
with some informal explanations.

How to do type inference? The critical cases are (symb) and (app). In (symb),
a symbol f can be typed by any instance of τ_f, and two different instances may be
necessary for typing a single term (e.g. s (s x)). For type inference, it is therefore
necessary to type f by its most general type, namely a renaming of τ_f with fresh
variables, and to instantiate it later when necessary.

Assume now that we want to infer the type of an application tu. We naturally
try to infer a type for t and a type for u using distinct fresh variables. Assume that
we get T and U' respectively. Then, tu is typable if there is a size substitution
φ and a product type (x : P)Q such that Tφ ≤ (x : P)Q and U'φ ≤ P.

By Theorem 1, checking whether A ≤ B amounts to checking whether A↓ ≤_s
B↓, and checking whether A ≤_s B amounts to applying the (prod) rule as much
as possible and then checking that (refl) or (size) holds. Hence, Tφ ≤ (x : P)Q
only if T↓ is a product. Thus, the application tu is typable if T↓ = (x : U)V and
there exists φ such that U'↓φ ≤_s Uφ. Finding φ such that Aφ ≤_s Bφ amounts
to applying the (prod) rule on A ≤_s B as much as possible and then finding φ such
that (refl) or (size) holds. So, a subtyping problem can be transformed into a
constraint problem on size variables.

We make this precise by first defining the constraints that can be generated. 

Definition 1 (Constraints). Constraint problems are defined as follows:

C ::= ⊥ | ⊤ | C ∧ C | a = b | a ≤ b

where a, b ∈ A, = is commutative, ∧ is associative and commutative, C ∧ C =
C ∧ ⊤ = C and C ∧ ⊥ = ⊥. A finite conjunction C_1 ∧ ... ∧ C_n is identified with ⊤
if n = 0. A constraint problem is in canonical form if it is neither of the form
C ∧ ⊤, nor of the form C ∧ ⊥, nor of the form C ∧ C ∧ D. In the following, we
always assume that constraint problems are in canonical form. An equality (resp.
inequality) problem is a problem having only equalities (resp. inequalities). An
inequality ∞ ≤ α is called an ∞-inequality. An inequality s^p α ≤ s^q β is called a
linear inequality. Solutions to constraint problems are defined as follows:

- S(⊥) = ∅,

- S(⊤) is the set of all size substitutions,

- S(C ∧ D) = S(C) ∩ S(D),

- S(a = b) = {φ | aφ = bφ},

- S(a ≤ b) = {φ | aφ ≤_A bφ}.

Let S^ℓ(C) = {φ ∈ S(C) | ∀α, αφ↓ ≠ ∞} be the set of linear solutions.

We now prove that a subtyping problem can be transformed into constraints. 

Lemma 3. Let S(U, V) be the set of substitutions φ such that Uφ ≤_s Vφ. We
have S(U, V) = S(C(U, V)) where C(U, V) is defined as follows:

- C((x : U)V, (x : U')V') = C(U', U) ∧ C(V, V'),

- C(C^a u⃗, C^b v⃗) = a ≤ b ∧ E^0(u_1, v_1) ∧ ... ∧ E^0(u_n, v_n) if |u⃗| = |v⃗| = n,

- C(U, V) = E^1(U, V) in the other cases,

and E^i(U, V) is defined as follows:

- E^i((x : U)V, (x : U')V') = E^i([x : U]V, [x : U']V') = E^i(UV, U'V')
  = E^i(U, U') ∧ E^i(V, V'),

- E^1(C^a, C^b) = a = b,

- E^0(C^a, C^b) = a = b ∧ ∞ ≤ a,

- E^i(c, c) = ⊤ if c ∈ S ∪ X ∪ F \ CF^□,

- E^i(U, V) = ⊥ in the other cases.

Proof. First, we clearly have φ ∈ S(E^1(U, V)) iff Uφ = Vφ, and φ ∈ S(E^0(U, V))
iff Uφ = Vφ and V(Uφ) = ∅. Thus, S(U, V) = S(C(U, V)). □



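Fig. 5. Type inference rules

(ax)     Γ ⊢^i_Y ★ : □

(prod)   Γ ⊢^i_Y U : s    Γ, x : U ⊢^i_Y V : s'
         --------------------------------------
                Γ ⊢^i_Y (x : U)V : s'

(size)   Γ ⊢^i_Y C^∞ : τ_C     (C ∈ CF^□)

(symb)   Γ ⊢^i_Y f : τ_f ρ_Y     (f ∉ CF^□)

(var)    Γ ⊢^i_Y x : xΓ     (x ∈ dom(Γ))

(abs)    Γ ⊢^i_Y U : s    Γ, x : U ⊢^i_Y v : V
         -------------------------------------     (V ≠ □)
             Γ ⊢^i_Y [x : U]v : (x : U)V

(app)    Γ ⊢^i_Y t : T    Γ ⊢^i_{Y ∪ V(T)} u : U'
         ----------------------------------------
               Γ ⊢^i_Y tu : Vφρ_Y{x ↦ u}

         (T↓ = (x : U)V, C = C(U'↓, U), S(C) ≠ ∅, φ = mgs(C))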



For renaming symbol types with variables outside some finite set of already
used variables, we assume given a function ρ which, to every finite set Y ⊆ Z,
associates an injection ρ_Y from Y to Z \ Y. In Figure 5, we define a type inference
algorithm ⊢^i_Y parametrized by a finite set Y of (already used) variables under the
following assumptions:

(1) It is decidable whether S(C) is empty or not.

(2) If S(C) ≠ ∅ then C has a most general solution mgs(C).

(3) If S(C) ≠ ∅ then mgs(C) is computable.

It would be interesting to try to give a modular presentation of type inference 
by clearly separating constraint generation from constraint solving, as it is done 
for ML in [25] for instance. However, for dealing with dependent types, one 
at least needs higher-order pattern unification. Indeed, assume that we have a 
constraint generation algorithm which, for a term t and a type (meta-)variable 
X, computes a set C of constraints on X whose solutions provide valid instances 
of X, i.e. valid types for t. Then, in (app), if the constraint generation gives
C_1 for t : Y and C_2 for u : Z, then it should give something like C_1 ∧ C_2 ∧
(∃U.∃V. Y =_{βR} (x : U)Vx ∧ Z ≤ U ∧ X =_{βR} Vu) for tu : X.



We now prove the correctness, completeness and minimality of ⊢^i, assuming
that symbol types are well sorted (⊢ τ_f : s_f for all f).

Theorem 2 (Correctness). If Γ is a valid ∞-environment and Γ ⊢^i_Y t : T,
then Γ ⊢ t : T, t is an ∞-term and V(T) ∩ Y = ∅.

Proof. By induction on ⊢^i_Y. We only detail the (app) case.

(app) By induction hypothesis, Γ ⊢ t : T, Γ ⊢ u : U' and t and u are ∞-terms.
Thus, tu is an ∞-term. By Lemma 1, Γ ⊢ t : Tφ and Γ ⊢ u : U'φ. Since
Tφ↓ = (x : Uφ)Vφ, we have Tφ ≠ □ and Γ ⊢ Tφ : s. By subject reduction,
Γ ⊢ (x : Uφ)Vφ : s. Hence, by (sub), Γ ⊢ t : (x : Uφ)Vφ. By Lemma 3,
S(C) = S(U'↓, U) and U'↓φ ≤_s Uφ. Since Γ ⊢ Uφ : s', by (sub), Γ ⊢ u : Uφ.
Therefore, by (app), Γ ⊢ tu : Vφ{x ↦ u} and Γ ⊢ tu : Vφρ_Y{x ↦ u} since
V(u) = ∅. □

Theorem 3 (Completeness and minimality). If Γ is an ∞-environment, t
is an ∞-term and Γ ⊢ t : T, then there are T' and ψ such that Γ ⊢^i_Y t : T' and
T'ψ ≤ T.

Proof. By induction on ⊢. We only detail some cases.

(symb) Take T' = τ_f ρ_Y and ψ = ρ_Y^{-1}φ.

(app) By induction hypothesis, there exist T, ψ_1, U' and ψ_2 such that Γ ⊢^i_Y
t : T, Tψ_1 ≤ (x : U)V, Γ ⊢^i_{Y ∪ V(T)} u : U' and U'ψ_2 ≤ U. By Lemma 2,
V(U') ∩ V(T) = ∅. Thus, dom(ψ_1) ∩ dom(ψ_2) = ∅. So, let ψ = ψ_1 ⊎ ψ_2. By
Lemma 1, T↓ψ ≤_s (x : U↓)V↓. Thus, T↓ = (x : U_1)V_1, U↓ ≤ U_1ψ and V_1ψ ≤
V↓. Since U'ψ ≤ U and U↓ ≤ U_1ψ, we have U'↓ψ ≤ U_1ψ and, by Lemma 1,
U'↓ψ ≤_s U_1ψ. Thus, ψ ∈ S(U'↓, U_1). By Lemma 3, S(U'↓, U_1) = S(C) with
C = C(U'↓, U_1). Thus, S(C) ≠ ∅ and there exists φ = mgs(C). Hence, Γ ⊢^i_Y
tu : V_1φρ_Yθ where θ = {x ↦ u}. We are left to prove that there exists φ' such
that V_1φρ_Yθφ' ≤ Vθ. Since φ = mgs(C), there exists ψ' such that φψ' ≤_A ψ.
So, let φ' = ρ_Y^{-1}ψ'. Since V(u) = ∅, θ commutes with size substitutions. Since
V_1ψ ≤ V↓ ≤ V, by Lemma 2, V_1φρ_Yθφ' = V_1φψ'θ ≤ V_1ψθ ≤ Vθ. □

Theorem 4 (Decidability of type-checking). Let Γ be an ∞-environment,
t be an ∞-term and T be a type such that Γ ⊢ T : s. Then, the problem of
knowing whether there is ψ such that Γ ⊢ t : Tψ is decidable.

Proof. The decision procedure consists in (1) trying to compute the type T'
such that Γ ⊢^i_Y t : T' by taking Y = V(T), and (2) trying to compute ψ =
mgs(C(T', T)). Every step is decidable.

We prove its correctness. Assume that Γ ⊢^i_Y t : T', Y = V(T) and ψ =
mgs(C(T', T)). Then, T'ψ ≤ Tψ and, by Theorem 2, Γ ⊢ t : T'. By Lemma 1,
Γ ⊢ t : T'ψ. Thus, by (sub), Γ ⊢ t : Tψ.

We now prove its completeness. Assume that there is ψ such that Γ ⊢ t : Tψ.
Let Y = V(T). Since Γ is valid and V(Γ) = ∅, by Theorem 3, there are T' and
φ such that Γ ⊢^i_Y t : T' and T'φ ≤ Tψ. This means that the decision procedure
cannot fail (ψ ⊎ φ ∈ S(T', T)). □



4 Solving constraints 



In this section, we prove that the satisfiability of constraint problems is decidable, 
and that a satisfiable problem has a smallest solution. The proof is organized 
as follows. First, we introduce simplification rules for equalities similar to usual 
unification procedures (Lemma 4). Second, we introduce simplification rules for 
inequalities (Lemma 5). From that, we can deduce some general result on the 
form of solutions (Lemma 7). We then prove that a conjunction of inequalities has
always a linear solution (Lemma 8). Then, by using linear algebra techniques, 
we prove that a satisfiable inequality problem has always a smallest solution 
(Lemma 11). Finally, all these results are combined in Theorem 5 for proving 
the assumptions of Section 3. 

Let a state S be ⊥ or a triplet E|E'|C where E and E' are conjunctions of
equalities and C a conjunction of inequalities. Let S(⊥) = ∅ and S(E|E'|C) =
S(E ∧ E' ∧ C) be the solutions of a state. A conjunction of equalities E is in
solved form if it is of the form α_1 = a_1 ∧ ... ∧ α_n = a_n (n ≥ 0) with the
variables α_i distinct from one another and V(a⃗) ∩ {α⃗} = ∅. It is identified with
the substitution {α⃗ ↦ a⃗}.



Fig. 6. Simplification rules for equalities

(1) E ∧ s a = s b | E' | C  ⇝  E ∧ a = b | E' | C
(2) E ∧ a = a | E' | C  ⇝  E | E' | C
(3) E ∧ a = s^{k+1} a | E' | C  ⇝  ⊥
(4) E ∧ ∞ = s^{k+1} a | E' | C  ⇝  ⊥
(5) E ∧ α = a | E' | C  ⇝  E{α ↦ a} | E'{α ↦ a} ∧ α = a | C{α ↦ a}   if α ∉ V(a)



The simplification rules on equalities given in Figure 6 correspond to the usual 
simplification rules for first-order unification [19], except that substitutions are 
propagated into the inequalities. 

Lemma 4. The relation ⇝ of Figure 6 terminates and preserves solutions: if
S_1 ⇝ S_2 then S(S_1) = S(S_2). Moreover, any normal form of E|⊤|C is either ⊥ or of
the form ⊤|E'|C' with E' in solved form and V(C') ∩ dom(E') = ∅.
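The equality part of the procedure is ordinary first-order unification over a unary signature, which the following sketch illustrates. It is not the paper's algorithm verbatim: size expressions are encoded as pairs `(k, base)` meaning s^k(base), with `base` a variable name or `None` for ∞, equalities are treated syntactically, and the function names are hypothetical. Propagation of the substitution into the inequalities is left out.

```python
def apply(sub, a):
    """Apply a (triangular) substitution to the size expression a = (k, base)."""
    k, base = a
    while base is not None and base in sub:
        k2, base = sub[base]
        k += k2
    return (k, base)


def solve_equalities(eqs):
    """Return a most general unifier as a dict, or None if there is none."""
    sub = {}
    for a, b in eqs:
        (ka, xa), (kb, xb) = apply(sub, a), apply(sub, b)
        m = min(ka, kb)                  # strip common successors
        ka, kb = ka - m, kb - m
        if (ka, xa) == (kb, xb):         # trivial equality: drop it
            continue
        if ka > 0:                       # make the left side successor-free
            (ka, xa), (kb, xb) = (kb, xb), (ka, xa)
        if xa is None:                   # left side is oo
            if kb == 0 and xb is not None:
                sub[xb] = (0, None)      # bind the variable on the right to oo
                continue
            return None                  # oo = s^(k+1) a has no syntactic solution
        if xb == xa:                     # occurs check: alpha = s^(k+1) alpha
            return None
        sub[xa] = (kb, xb)               # eliminate the variable
    return {x: apply(sub, (0, x)) for x in sub}
```

For example, the single equality s α = s s β yields the binding α ↦ s β, while α = s α is rejected by the occurs check.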

We now introduce a notion of graphs due to Pratt [26] that allows us to detect
the variables that are equivalent to ∞. In the following, we use other standard
techniques from graph combinatorics and linear algebra. Note however that we
apply them on symbolic constraints, while they are generally used on numerical
constraints. What we are looking for is substitutions, not numerical solutions.
In particular, we do not have the constant 0 in size expressions (although it
could be added without having to change many things). Yet, for proving that
satisfiable problems have most general solutions, we will use some isomorphism
between symbolic solutions and numerical ones (see Lemma 10).

Definition 2 (Dependency graph). To a conjunction of linear inequalities
C, we associate a graph G_C on V(C) as follows. To every constraint s^p α ≤ s^q β,
we associate the labeled edge α →^{p−q} β. The cost of a path α_1 →^{p_1} ... →^{p_k} α_{k+1} is
Σ_{i=1}^{k} p_i. A cyclic path (i.e. when α_{k+1} = α_1) is increasing if its cost is > 0.
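Detecting an increasing cycle is a positive-cycle test on the dependency graph. The sketch below uses a Bellman-Ford style relaxation for longest paths; the encoding of a linear inequality s^p α ≤ s^q β as a tuple `(p, alpha, q, beta)` and the function names are assumptions made for this illustration.

```python
def dependency_edges(constraints):
    """Edge alpha -> beta labeled p - q for each inequality s^p alpha <= s^q beta."""
    return [(a, b, p - q) for (p, a, q, b) in constraints]


def has_increasing_cycle(constraints):
    """True iff the dependency graph has a cycle of cost > 0."""
    edges = dependency_edges(constraints)
    vertices = {v for a, b, _ in edges for v in (a, b)}
    cost = {v: 0 for v in vertices}          # every vertex is a possible start
    for _ in range(len(vertices) + 1):
        changed = False
        for a, b, w in edges:
            if cost[a] + w > cost[b]:        # a longer path to b was found
                cost[b] = cost[a] + w
                changed = True
        if not changed:                      # costs stabilised: no positive cycle
            return False
    return True                              # still improving: positive cycle
```

On the constraints s α ≤ β and β ≤ α the cycle has cost 1, so both variables are forced to ∞; replacing the second constraint by β ≤ s s α gives a cycle of cost −1, which is harmless.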

Fig. 7. Simplification rules for inequalities

(1) C ∧ a ≤ s^k ∞  ⇝  C

(2) C ∧ D  ⇝  C ∧ {∞ ≤ α | α ∈ V(D)}   if G_D is increasing

(3) C ∧ s^k ∞ ≤ s^l α  ⇝  C{α ↦ ∞} ∧ ∞ ≤ α   if α ∈ V(C)

A conjunction of inequalities C is in reduced form if it is of the form C_∞ ∧ C_ℓ
with C_∞ a conjunction of ∞-inequalities, C_ℓ a conjunction of linear inequalities
with no increasing cycle, and V(C_∞) ∩ V(C_ℓ) = ∅.

Lemma 5. The relation of Figure 7 on inequality problems terminates and pre- 
serves solutions. Moreover, any normal form is in reduced form. 

Lemma 6. If C is a conjunction of inequalities then S(C) ≠ ∅. Moreover, if C
is a conjunction of ∞-inequalities then S(C) = {φ | ∀α ∈ V(C), αφ↓ = ∞}.

Lemma 7. Assume that E|⊤|C has normal form ⊤|E'|C' by the rules of Figure
6, and C' has normal form D by the rules of Figure 7. Then, S(E ∧ C) ≠ ∅,
E' = mgs(E) and every φ ∈ S(E ∧ C) is of the form E'(υ ⊎ ψ) with υ ∈ S(D_∞)
and ψ ∈ S(D_ℓ).

Proof. The fact that, in this case, S(E) ≠ ∅ and E' = mgs(E) is a well known
result on unification [19]. Since S(E ∧ C) = S(E' ∧ D), V(E') ∩ V(D) = ∅ and
S(D) ≠ ∅, we have S(E ∧ C) ≠ ∅. Furthermore, every φ ∈ S(E ∧ C) is of the form
E'φ' since S(E' ∧ D) ⊆ S(E'). Now, since V(D_∞) ∩ V(D_ℓ) = ∅, φ' = υ ⊎ ψ with
υ ∈ S(D_∞) and ψ ∈ S(D_ℓ). □

Hence, the solutions of a constraint problem can be obtained from the solu- 
tions of the equalities, which is a simple first-order unification problem, and from 
the solutions of the linear inequalities resulting from the previous simplifications.

In the following, let C be a conjunction of K linear inequalities with no
increasing cycle, and L be the biggest label in absolute value in G_C. We first
prove that C always has a linear solution by using Bellman-Ford's algorithm.

Lemma 8. S^ℓ(C) ≠ ∅.

Proof. Let succ(α) = {β | α →^p β ∈ G_C} and succ* be the reflexive and
transitive closure of succ. Choose γ ∈ Z \ V(C), a set R of vertices in G_C such
that succ*(R) covers G_C, and a minimal cost q_β ≥ KL for every β ∈ R. Let
the cost of a vertex α_{k+1} along a path α_1 →^{p_1} α_2 ... →^{p_k} α_{k+1} with α_1 ∈ R
be q_{α_1} + Σ_{i=1}^{k} p_i. Now, let ω_β be the maximal cost for β along all the possible
paths from a vertex in R. We have ω_β ≥ 0 since there is no increasing cycle.
Hence, for every edge α →^p β ∈ G_C, we have ω_α + p ≤ ω_β. Thus, the substitution
φ = {α ↦ s^{ω_α}γ | α ∈ V(C)} ∈ S^ℓ(C). □
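The construction in this proof can be sketched as a longest-path computation. The code below is an illustration under stated assumptions, not the paper's procedure: inequalities s^p α ≤ s^q β are encoded as tuples `(p, alpha, q, beta)`, the set R is taken to be all vertices with initial cost K·L, and `fresh` names a variable assumed not to occur in the constraints.

```python
def linear_solution(constraints, fresh="gamma"):
    """Map each variable alpha to (omega_alpha, fresh), i.e. s^omega_alpha(fresh)."""
    edges = [(a, b, p - q) for (p, a, q, b) in constraints]
    vertices = {v for a, b, _ in edges for v in (a, b)}
    assert fresh not in vertices, "the base variable must be fresh"
    K = len(edges)
    L = max((abs(w) for _, _, w in edges), default=0)
    omega = {v: K * L for v in vertices}     # initial cost of every root
    for _ in range(len(vertices) + 1):
        changed = False
        for a, b, w in edges:
            if omega[a] + w > omega[b]:      # enforce omega_a + p <= omega_b
                omega[b] = omega[a] + w
                changed = True
        if not changed:
            return {v: (omega[v], fresh) for v in vertices}
    raise ValueError("increasing cycle: no linear solution")
```

Each constraint s^p α ≤ s^q β then holds under the returned substitution because p + ω_α ≤ q + ω_β and both sides share the same base variable. The solution found this way is a linear one, not necessarily the smallest.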



We now prove that any solution has a more general linear solution. This
implies that inequality problems are always satisfiable and that the satisfiability
of a constraint problem only depends on its equalities.

Lemma 9. If φ ∈ S(C) then there exists ψ ∈ S^ℓ(C) such that ψ ≤_A φ.

We now prove that S^ℓ(C) has a smallest element. To this end, assume that
inequalities are ordered and that V(C) = {α_1, ..., α_n}. We associate to C an
adjacency-like matrix M = (m_{i,j}) with K lines and n columns, and a vector
v = (v_i) of length K as follows. Assume that the i-th inequality of C is of the
form s^p α_j ≤ s^q α_k. Then, m_{i,j} = 1, m_{i,k} = −1, m_{i,l} = 0 if l ∉ {j, k}, and
v_i = q − p. Let P = {z ∈ ℚ^n | Mz ≤ v, z ≥ 0} and P' = P ∩ ℤ^n.

To a substitution φ ∈ S^ℓ(C), we associate the vector z^φ such that z^φ_i is the
natural number p such that α_iφ = s^p β.

To a vector z ∈ P', we associate a substitution φ_z as follows. Let {G_1, ..., G_s}
be the connected components of G_C. For all i, let c_i be the component number
to which α_i belongs. Let β_1, ..., β_s be variables distinct from one another and
not in V(C). We define α_iφ_z = s^{z_i}β_{c_i}.

We then study the relations between symbolic and numerical solutions.

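Lemma 10.

- If $\varphi \in S^\ell(C)$ then $z^\varphi \in P'$. Furthermore, if $\varphi \leq_A \varphi'$ then $z^\varphi \leq z^{\varphi'}$.

- If $z \in P'$ then $\varphi_z \in S^\ell(C)$. Furthermore, if $z \leq z'$ then $\varphi_z \leq_A \varphi_{z'}$.

- $z^{\varphi_z} = z$ and $\varphi_{z^\varphi} \sqsubseteq \varphi$.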
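The two translations between linear substitutions and vectors can be sketched as follows. The representation is an assumption made for the example: a linear substitution is a dictionary sending each variable to a pair `(p, base)` meaning s^p applied to the variable `base`, inequalities are tuples `(p, j, q, k)` with 0-based variable indices, and the fresh variables are named `beta1`, `beta2`, and so on.

```python
def component_numbers(n, inequalities):
    """Number the connected components of the constraint graph (union-find)."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for _, j, _, k in inequalities:
        parent[find(j)] = find(k)
    roots = sorted({find(i) for i in range(n)})
    number = {r: c for c, r in enumerate(roots, start=1)}
    return [number[find(i)] for i in range(n)]


def subst_of_vector(variables, inequalities, z):
    """phi_z: variable i goes to s^(z_i) applied to the fresh variable of its component."""
    comp = component_numbers(len(variables), inequalities)
    return {a: (z[i], f"beta{comp[i]}") for i, a in enumerate(variables)}


def vector_of_subst(variables, phi):
    """z^phi: the number of s symbols in the image of each variable."""
    return [phi[a][0] for a in variables]


variables = ["a1", "a2", "a3"]
phi = subst_of_vector(variables, [(1, 0, 0, 1)], [0, 1, 5])
```

Translating a vector to a substitution and back returns the original vector, which is the first half of the third item of the lemma.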

Finally, we are left to prove that $P'$ has a smallest element. The proof uses
techniques from linear algebra.

Lemma 11. There is a unique $z^* \in P'$ such that, for all $z \in P'$, $z^* \leq z$.

An efficient algorithm for computing the smallest solution of a set of linear 
inequalities with at most two variables per inequality can be found in [23]. A 
more efficient algorithm can perhaps be obtained by taking into account the 
specificities of our problems. 
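For small instances, the smallest solution can also be obtained by a plain fixpoint iteration. The sketch below is not the algorithm of [23]; it is a Bellman-Ford style relaxation given only as an illustration, with inequalities encoded as hypothetical tuples `(p, j, q, k)` meaning z_j - z_k <= q - p over natural numbers.

```python
def smallest_solution(n, inequalities):
    """Least vector z of natural numbers with z[j] - z[k] <= q - p for every (p, j, q, k).

    Starts from the zero vector and raises each z[k] only as far as some
    inequality forces it, so every iterate stays below any actual solution.
    Raises ValueError when an increasing cycle makes the system unsatisfiable.
    """
    z = [0] * n
    for _ in range(n + 1):
        changed = False
        for p, j, q, k in inequalities:
            if z[j] + p - q > z[k]:
                z[k] = z[j] + p - q
                changed = True
        if not changed:
            return z
    raise ValueError("increasing cycle: no solution")


# s(a1) <= a2 and a2 <= s(s(a3)) over variables indexed 0, 1, 2
z_star = smallest_solution(3, [(1, 0, 0, 1), (0, 1, 2, 2)])  # [0, 1, 0]
```

The iteration performs at most n + 1 passes over the inequalities, which is enough for illustration but does not exploit the structure that a dedicated two-variables-per-inequality algorithm would.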

Gathering all the previous results, we get the decidability. 

Theorem 5 (Decidability). Let $C$ be a constraint problem. Whether $S(C)$ is
empty or not can be decided in polynomial time w.r.t. the size of equalities in $C$.
Furthermore, if $S(C) \neq \emptyset$ then $S(C)$ has a smallest solution that is computable
in polynomial time w.r.t. the size of inequalities.

5 Conclusion and related works 

In Section 3, we give a general algorithm for type inference with size annotations
based on constraint solving, that does not depend on the size algebra. For
completeness, we require satisfiable sets of constraints to have a computable most
general solution. In Section 4, we prove that this is the case if the size algebra is
built from the symbols $s$ and $\infty$ which, although simple, captures usual inductive
definitions (since then the size corresponds to the number of constructors) and
much more (see the introduction and [7]).

A natural extension would be to add the symbol $+$ in the size algebra, for
typing list concatenation in a more precise way for instance. We think that the
techniques used in the present work can cope with this extension. However,
without restrictions on symbol types, one may get constraints like $1 \leq \alpha + \beta$ and lose
the uniqueness of the smallest solution. We think that simple and general
restrictions can be found to prevent such constraints from appearing. Now, if symbols like $\times$
are added to the size algebra, then we lose linearity and need more sophisticated
mathematical tools.

The point is that, because we consider dependent types and subtyping, we are
not only interested in satisfiability but also in minimality and uniqueness, in order
to have completeness of type inference [12]. There exist many works on type
inference and constraint solving. We only mention some that we found more or
less close to ours: Zenger's indexed types [32], Xi's Dependent^ ML [30], Odersky
et al.'s ML with constrained types [25], Abel's sized types [1], and Barthe et al.'s
staged types [4]. We note the following differences:

Terms. Except [4], the previously cited works consider λ-terms à la Curry,
i.e. without types in λ-abstractions. Instead, we consider λ-terms à la Church,
i.e. with types in λ-abstractions. Note that type inference with λ-terms à la
Curry and polymorphic or dependent types is not decidable. Furthermore, they
all consider functions defined by fixpoint and matching on constructors. Instead,
we consider functions defined by rewrite rules with matching both on constructor
and defined symbols (e.g. associativity and distributivity rules).

Types. If we disregard constraints attached to types, they consider simple
or polymorphic types, and we consider fully polymorphic and dependent types.
Now, our data type constructors carry no constraints: constraints only come up
from type inference. On the other hand, the constructors of Zenger's indexed
data types must satisfy polynomial equations, and Xi's index variables can be
assigned boolean propositions that must be satisfiable in some given model (e.g.
Presburger arithmetic). Explicit constraints allow a more precise typing and
more function definitions to be accepted. For instance (see [7]), in order for
quicksort to have type $list^\alpha \Rightarrow list^\alpha$, we need the auxiliary pivot function to have
type $nat^\infty \Rightarrow list^\alpha \Rightarrow list^\beta \times list^\gamma$ with the constraint $\alpha = \beta + \gamma$. And, if quicksort
has type $list^\infty \Rightarrow list^\infty$ then a rule like $f\ (cons\ x\ l) \rightarrow g\ x\ (f\ (quicksort\ l))$ is
rejected since $(quicksort\ l)$ cannot be proved to be smaller than $(cons\ x\ l)$. The
same holds in [1,4].

Constraints. In contrast with Xi and Odersky et al., who consider the
constraint system as a parameter, giving DML(C) and HM(X) respectively, we
consider a fixed constraint system, namely the one introduced in [3]. It is close to
the one considered by Abel, whose size algebra does not have $\infty$ but whose types
have explicit bounded quantifications. Inductive types are indeed interpreted
in the same way. We already mentioned also that Zenger considers polynomial
equations. However, his equivalence on types is defined in such a way that, for
instance, $list^\alpha$ is equivalent to $list^{2\alpha}$, which is not very natural. So, the next
step in our work would be to consider explicit constraints from an abstract
constraint system. By doing so, Odersky et al. get general results on the
completeness of inference. Sulzmann [28] gets more general results by switching to
a fully constraint-based approach. In this approach, completeness is achieved
if every constraint can be represented by a type. With term-based inference and
dependent types, which is our case, completeness requires minimality which is
not always possible [12].

^ By "dependent", Xi means constrained types, not full dependent types.

Constraint solving. In [4], Barthe et al. consider system F with ML-like
definitions and the same size annotations. Since they have no dependent types,
they only have inequality constraints. They also use dependency graphs for
eliminating $\infty$, and give a specific algorithm for finding the most general solution.
But they do not study the relations between linear constraints and linear
programming. So, their algorithm is less efficient than [23], and cannot be extended
to size annotations like $a + b$, for typing addition or concatenation.

Inference of size annotations. As already mentioned in the introduction,
we do not infer size annotations for function symbols like [13,4]. We just check
that function definitions are valid w.r.t. size annotations, and that they preserve
termination. However, finding annotations that satisfy these conditions can
easily be expressed as a constraint problem. Thus, the techniques used in this paper
can certainly be extended for inferring size annotations too. For instance, if we
take $- : nat^\alpha \Rightarrow nat^\beta \Rightarrow nat^X$, the rules of $-$ given in the introduction are valid
whenever $0 \leq X$, $\alpha \leq X$ and $X \leq X$, and the most general solution of this
constraint problem is $X = \alpha$.

Acknowledgments. I would like to thank very much Miki Hermann, Hong- 
wei Xi, Christophe Ringeissen and Andreas Abel for their comments on a pre- 
vious version of this paper. 
Proofs

5.1 Remark about constraint solving 

One could think of using Comon's work [14] but it is not possible for several
reasons:

- We consider two kinds of constraints: equality constraints $a = b$ where $=$ is
interpreted by the syntactic equality, and inequality constraints $a \leq b$ where
$\leq$ is interpreted by the quasi-ordering $\leq_A$ on size expressions. Instead of non-strict
inequalities, Comon considers strict inequalities $a < b$ where $<$ is interpreted
by the lexicographic path ordering (LPO). Since $\leq_A$ is a quasi-ordering, we
do not have $a \leq_A b \Leftrightarrow a <_A b \vee a = b$.

- Even though one can get rid of $\infty$ symbols in a first step, which we do in
Lemmas 7 and 9, Comon assumes that there is at least one constant symbol.
Indeed, he studies the ground solutions of a boolean combination of equations
and inequations. However, without $\infty$, we have no ground term. It does not
matter since we do not restrict ourselves to ground solutions.

5.2 Proof of Lemma 4 

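The relation $\rightarrow$ strictly decreases the measure $(s(\mathcal{E}), c(\mathcal{E}))_{lex}$ where $s(\mathcal{E})$ is the
number of constraints and $c(\mathcal{E})$ the number of symbols. Its correctness is easily
checked. Now, let $S = \mathcal{E}''|\mathcal{E}'|C'$ be a normal form of $\mathcal{E}|\top|C$. If $\mathcal{E}'' \neq \top$ then $S$ is
reducible. Now, one can easily check that, if $\mathcal{E}_1|\mathcal{E}'_1|C_1 \rightarrow \mathcal{E}_2|\mathcal{E}'_2|C_2$, $\mathcal{E}'_1$ is in solved
form and $V(C_1) \cap dom(\mathcal{E}'_1) = \emptyset$, then $\mathcal{E}'_2$ is in solved form and $V(C_2) \cap dom(\mathcal{E}'_2) = \emptyset$.
So, $\mathcal{E}'$ is in solved form and $V(C') \cap dom(\mathcal{E}') = \emptyset$.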

5.3 Proof of Lemma 5 

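The relation strictly decreases the measure $(c(C), v(C))_{lex}$ where $c(C)$ is the
number of symbols and variables and $v(C)$ the multiset of occurrences of each
variable in $C$. We now prove the correctness of these rules. (1) is trivial. (3) follows
from Lemma 2. For (2), let $\mathcal{D}' = \bigwedge\{\infty \leq \alpha \mid \alpha \in V(\mathcal{D})\}$. We clearly have
$S(\mathcal{D}') \subseteq S(\mathcal{D})$. Assume that $G_\mathcal{D} = \alpha_1 \xrightarrow{p_1} \ldots \xrightarrow{p_k} \alpha_1$ and $\theta \in S(\mathcal{D})$. If $\alpha_1\theta\downarrow = \infty$
then, for all $i$, $\alpha_i\theta\downarrow = \infty$ and $\theta \in S(\mathcal{D}')$. Otherwise, there exist $\gamma \in Z$ and, for
all $i$, $m_i \in N$ such that $\alpha_i\theta = s^{m_i}\gamma$, $m_1 + p_1 \leq m_2$, ..., $m_k + p_k \leq m_1$. Thus,
$\sum_{i=1}^{k} m_i + \sum_{i=1}^{k} p_i \leq \sum_{i=1}^{k} m_i$. Hence, $\sum_{i=1}^{k} p_i \leq 0$, which is not possible since $G_\mathcal{D}$
is increasing. Finally, a normal form is clearly in reduced form.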

5.4 Proof of Lemma 6 

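Let $S = \{\varphi \mid \forall \alpha \in V(C), \alpha\varphi\downarrow = \infty\}$. We prove that $S \subseteq S(C)$. Let $\varphi = \{\alpha \mapsto
\infty \mid \alpha \in V(C)\}$ and $a \leq b \in C$. We have $a = s^p a'$ and $b = s^q b'$ with $a', b' \in
Z \cup \{\infty\}$. So, by Lemma 2, $a\varphi = s^p\infty \leq_A b\varphi = s^q\infty$ and $\varphi \in S(C)$.

Assume now that $C$ is a conjunction of $\infty$-inequalities. Let $\varphi \in S(C)$ and
$\alpha \in V(C)$. Since $\alpha \in V(C)$, there exists a constraint $\infty \leq \alpha$ in $C$. Thus, by
Lemma 2, $\alpha\varphi\downarrow = \infty$ and $\varphi \in S$.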

5.5 Proof of Lemma 9 

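We can assume w.l.o.g. that $dom(\varphi) \subseteq V(C)$. If, for all $\alpha \in V(C)$, $\alpha\varphi\downarrow = \infty$,
then any $\psi \in S^\ell(C) \neq \emptyset$ works. Otherwise, there exist $\alpha \in V(C)$, $\gamma$ and $p$
such that $\alpha\varphi = s^p\gamma$. W.l.o.g., we can assume that $C$ has only one connected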
component. Let = {a G dom{(p) \ aip 00}, = dom((^) \ and 
= {/? G I < s«/3 e C => atpi?^ 00}. For every a G De, let be 
the integer k such that aip = s'^j. Let Ci = {5^0; < s*/3 | aiplj^ oo,f]iflyt 00}, 
C2 = {sPa < sip I aifl^ oo,f3ipl= 00}, C3 = {s^a < s"?/? | a(/)i= oo,/3(pi= 00} 
and = C3 W {/3 < /3 I /3 e 1?;^}. We have C = Ci W C2 W C3. After the proof of 
Lemma 8, by taking R I5 D'^ and (7^ = max{KL, LUa + P — Q \ s^P G C} 

for every /? G D;^, there exists tp' G S\C^). We have dom((^') = V(C^) = Doo- 
Let V = y'lrif W 'P'- We clearly have V hnear and tp f- We now prove that 
G S\C). We have Vlv(Ci) = <^>\v{c^) G ^(Ci) and i>\v{c^) = V^'lvcc,) e •S'CCa). 
Let now s^a < s'^P G C2. We must check that s^a^p < s'^P^p' . It follows from the 
definition of ip' . 

5.6 Proof of Lemma 10 

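- Assume that the $i$-th inequality is of the form $s^p\alpha_j \leq s^q\alpha_k$. We must prove
that $z_j^\varphi - z_k^\varphi \leq q - p$. By assumption, $s^p\alpha_j\varphi \leq_A s^q\alpha_k\varphi$. Hence, $p + z_j^\varphi \leq q + z_k^\varphi$.
The second claim is immediate.

- Assume that the $i$-th inequality is of the form $s^p\alpha_j \leq s^q\alpha_k$. We must prove
that $s^p\alpha_j\varphi_z \leq_A s^q\alpha_k\varphi_z$, that is, $s^{p+z_j}\beta_{c_j} \leq_A s^{q+z_k}\beta_{c_k}$. Since $\alpha_j$ and $\alpha_k$ are
connected in $G_C$, $c_j = c_k$. And, by assumption, $z_j - z_k \leq q - p$.

- $z_i^{\varphi_z}$ is the integer $p$ such that $\alpha_i\varphi_z = s^p\beta$, and $\alpha_i\varphi_z = s^{z_i}\beta_{c_i}$. Thus, $p = z_i$.

- $\alpha_i\varphi_{z^\varphi} = s^{z_i^\varphi}\beta_{c_i}$, and $z_i^\varphi$ is the integer $p$ such that $\alpha_i\varphi = s^p\gamma$. Every variable
of a connected component $c$ is mapped by $\varphi$ to the same variable $\gamma_c$. Let $\psi$
be the substitution which associates $\gamma_c$ to $\beta_c$. We have $\alpha_i\varphi_{z^\varphi}\psi = s^{z_i^\varphi}\beta_{c_i}\psi =
s^{z_i^\varphi}\gamma_{c_i} = \alpha_i\varphi$. Thus, $\varphi_{z^\varphi} \sqsubseteq \varphi$.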



5.7 Proof of Lemma 11 



Lemma 11 is Lemma 12 (6) below. 

See for instance [27] for details on polyhedrons, i.e. sets of the form $\{z \in
Q^n \mid Mz \leq v\}$. Note that $P = \{z \in Q^n \mid M'z \leq v'\}$ with $M' = \begin{pmatrix} M \\ -I \end{pmatrix}$ and
$v' = \begin{pmatrix} v \\ 0 \end{pmatrix}$, where $I$ is the identity matrix. We say that a bit vector is a vector
whose components are in $\{0, 1\}$. Given two vectors $z^a$ and $z^b$, $\min\{z^a, z^b\}$ is the
vector $z$ such that $z_i = \min\{z_i^a, z_i^b\}$.

Lemma 12.

(1) $P$ is pointed, i.e. its lineality space $\{z \in Q^n \mid M'z = 0\}$ has dimension 0.

(2) $P$ is integral, i.e. $P$ is the convex hull of $P'$.

(3) $P$ is infinite.

(4) Every minimal proper face of $P$ has for direction a bit vector.

(5) If $z^a, z^b \in P$ then $\min\{z^a, z^b\} \in P$.

(6) There is a unique $z^* \in P'$ such that, for all $z \in P'$, $z^* \leq z$.

Proof. (1) If $M'z = 0$ then $-Iz = 0$ and $z = 0$.

(2) $P$ is integral since the transpose of $M$ is totally unimodular: it is a $\{0, \pm 1\}$-matrix
with in each column exactly one $+1$ and one $-1$ ([27] p. 274).

(3) As any polyhedron, there is a polytope $Q$ such that $P = Q + char.cone(P)$
([27] p. 88), where $char.cone(P) = \{z \in Q^n \mid M'z \leq 0\}$ is the characteristic
cone of $P$. Since every row of $M$ has exactly one $+1$ and one $-1$, the sum of
the columns of $M$ is 0. Thus, the vector $\mathbf{1}$ whose components are all equal to
1 belongs to $char.cone(P)$ and, either $P = \emptyset$ or $P$ is infinite. After Lemma
8, $S^\ell(C) \neq \emptyset$. Thus, $P$ is infinite.

(4) For every minimal proper face $F$ of $P$, there exist a row submatrix $(L\ u)$ of
$(M'\ v')$ and two rows $(a^i\ v'_i)$ and $(a^j\ v'_j)$ of $(M'\ v')$ such that $rank(L) =
rank(M') - 1$ and $F = \{z \in Q^n \mid Lz = u, {}^t a^i z \leq v'_i, {}^t a^j z \leq v'_j\}$ ([27] p.
105). The direction of $F$ is given by $Ker(L) = \{z \in Q^n \mid Lz = 0\}$. Let $e^j$ be
the unit vector such that $e^j_j = 1$ and $e^j_i = 0$ if $i \neq j$. Since $rank(M') = n$,
$rank(L) = n - 1$ and there exists $k \leq n$ such that $\{Le^j \mid j \neq k\}$ is a family
of linearly independent vectors. Thus, $N = \begin{pmatrix} L \\ {}^t e^k \end{pmatrix}$ is non singular. Let $w =
N^{-1}e^n$. If $Lz = 0$ then $Nz = z_k e^n$ and $z = z_k w$. We have $N^{-1} = \frac{{}^t com(N)}{\det(N)}$
where ${}^t com(N)$ is the transpose matrix of the cofactors of $N$. Now, one can
easily prove that, if every row (or column) of a matrix $U$ is either $0$, $\pm e^j$
or $e^j - e^k$ with $j \neq k$, then $\det(U) \in \{0, \pm 1\}$. Hence, $\det(N) = \pm 1$ and
$w$ is a $\{0, \pm 1\}$-vector. The equations satisfied by $z$ in $Lz = 0$ are either
$z_i = 0$ or $z_i = z_j$. If there is no equation involving $z_i$ then $Ker(L) = Qe^i$
and $w = \pm e^i$. Otherwise, $w \geq 0$ or $w \leq 0$. Since $w$ can be replaced by $-w$
w.l.o.g., $w$ can always be defined as a bit vector.



(5) Let $z = \min\{z^a, z^b\}$. If $z^a \leq z^b$ or $z^b \leq z^a$, this is immediate. Assume now
that there are $i \neq j$ such that $z_i^a < z_i^b$ and $z_j^a > z_j^b$. Since every minimal
proper face of $P$ has for direction a bit vector, we must have $z \in P$.

(6) Let $c = \min\{\mathbf{1}z \mid z \in P\}$, $F = \{z \in P \mid \mathbf{1}z = c\}$, $z^* \in F$ and $z \in P$.
Assume that $z^* \not\leq z$. Then, $z' = \min\{z^*, z\} \in P$ and $\mathbf{1}z' < \mathbf{1}z^*$, which is
not possible. Thus, $z^* \leq z$ and $F = \{z^*\}$. Now, since $P$ is integral, $z^* \in P'$.

□
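Property (5) can also be checked experimentally on small instances. The sketch below is an illustration only, with inequalities encoded as hypothetical tuples `(p, j, q, k)` meaning z_j - z_k <= q - p; it enumerates the integer solutions in a bounded box and verifies that the componentwise minimum of any two of them is again a solution. The elementary reason, different from the face argument used in the proof, is that each row bounds a difference z_j - z_k: whichever of the two vectors attains the minimum at index k already satisfies the bound, and its j-th component is at least the minimum at index j.

```python
from itertools import product


def satisfies(inequalities, z):
    """True when z is non-negative and z[j] - z[k] <= q - p for every (p, j, q, k)."""
    return all(x >= 0 for x in z) and all(
        z[j] - z[k] <= q - p for p, j, q, k in inequalities)


def pointwise_min(za, zb):
    """Componentwise minimum of two vectors of the same length."""
    return [min(x, y) for x, y in zip(za, zb)]


def min_closed_on_box(n, inequalities, bound):
    """Check closure under pointwise_min for all solutions with entries in 0..bound."""
    solutions = [z for z in product(range(bound + 1), repeat=n)
                 if satisfies(inequalities, z)]
    return all(satisfies(inequalities, pointwise_min(a, b))
               for a in solutions for b in solutions)


closed = min_closed_on_box(3, [(1, 0, 0, 1), (0, 1, 2, 2)], 3)
```

Such a finite check is of course not a proof, but it is a cheap sanity test when experimenting with other constraint shapes, for instance the sums mentioned in the conclusion, where closure under minimum may fail.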

5.8 Proof of Theorem 5 

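We can assume that $C \neq \bot$. Let $C^{=}$ be the equalities of $C$ and $C^{\leq}$ be the
inequalities of $C$. Compute the normal form of $C^{=}|\top|C^{\leq}$ w.r.t. the rules of Figure
6. This can be done in polynomial time w.r.t. the size of equalities. If the normal
form is $\bot$ then $S(C) = \emptyset$ and we are done. Otherwise, it is of the form $\top|\mathcal{E}|\mathcal{D}$. Let
$\mathcal{D}_\infty \wedge \mathcal{D}_\ell$ be the normal form of $\mathcal{D}$ w.r.t. the rules of Figure 7. It can be computed
in polynomial time w.r.t. the size of constraints. Let $P = \{z \in Q^n \mid M'z \leq v'\}$
where $M'$ and $v'$ are the matrix and the vector associated to $\mathcal{D}_\ell$. Compute
$c = \min\{\mathbf{1}z \mid z \in P\}$ and $z^* \in \{z \in P \mid \mathbf{1}z = c\}$. This can be done in polynomial
time w.r.t. the size of constraints since $P$ is integral (see [27] p. 232). Finally,
let $mgs(C) = \mathcal{E}(\upsilon \uplus \varphi_{z^*})$ where $\upsilon \in S(\mathcal{D}_\infty)$. We prove that this is the smallest
solution.

Let $\varphi \in S(C)$. By Lemma 7, $\varphi = \mathcal{E}(\upsilon' \uplus \varphi')$ where $\upsilon' \in S(\mathcal{D}_\infty)$ and $\varphi' \in S(\mathcal{D}_\ell)$.
By Lemma 9, there exists $\psi' \in S^\ell(\mathcal{D}_\ell)$ such that $\psi' \sqsubseteq \varphi'$. By Lemma 10, $z^{\psi'} \in P'$.
By Lemma 11, $z^* \leq z^{\psi'}$. By Lemma 10, $\varphi_{z^*} \sqsubseteq \varphi_{z^{\psi'}}$. By Lemma 10, $\varphi_{z^{\psi'}} \sqsubseteq \psi'$.
Thus, $\varphi_{z^*} \sqsubseteq \varphi'$ and $mgs(C) \sqsubseteq \varphi$ since $\upsilon \simeq \upsilon'$.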



